Notes 11/11:
* figure out dcast issues
* replacing NA/0s within pipe
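A minimal sketch for the NA/0 item, assuming dplyr and tidyr are available. It swaps in `tidyr::pivot_wider()` as one possible alternative to `dcast`, not a fix for the dcast issue itself. The data frame and its column names (`fish_id`, `substrate`, `bites`) are hypothetical.

```r
library(dplyr)
library(tidyr)

# Hypothetical long-format bite counts; column names are illustrative only.
bites_long <- tibble(
  fish_id   = c("a", "a", "b", "c"),
  substrate = c("turf", "coral", "turf", "coral"),
  bites     = c(12, NA, 7, 3)
)

bites_wide <- bites_long %>%
  mutate(bites = replace_na(bites, 0)) %>%   # NA -> 0 inside the pipe
  pivot_wider(
    names_from  = substrate,
    values_from = bites,
    values_fill = 0                          # absent fish x substrate cells -> 0
  )

# The reverse direction: treat zeros as missing.
bites_na <- bites_wide %>%
  mutate(across(c(turf, coral), ~ na_if(.x, 0)))
```

Whether a zero means "observed, no bites" or "not observed" decides which direction is appropriate, so the two steps are kept separate here.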
Need to determine an appropriate size range for comparison. Because fish sizes are not evenly distributed across islands (i.e. larger fish in Bonaire, smaller in Barbuda), I will likely want to compare length-feeding relationships rather than pooled averages.
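One way this comparison could be set up, as a rough sketch on simulated data: restrict to the size range shared by the islands, then fit a length-by-island interaction instead of comparing island means. The values are invented; `fr` and `length_cm` follow the variable names used in the models later in these notes.

```r
set.seed(1)
# Simulated stand-in data; island names come from the notes, values are made up.
fish <- data.frame(
  island    = rep(c("Bonaire", "Barbuda"), each = 40),
  length_cm = c(runif(40, 20, 40), runif(40, 10, 30))
)
fish$fr <- 1500 - 20 * fish$length_cm + rnorm(80, sd = 100)

# Overlapping size range: largest island minimum to smallest island maximum.
rng <- tapply(fish$length_cm, fish$island, range)
lo  <- max(sapply(rng, function(r) r[1]))
hi  <- min(sapply(rng, function(r) r[2]))
overlap <- subset(fish, length_cm >= lo & length_cm <= hi)

# Compare slopes and intercepts rather than pooled island means.
fit <- lm(fr ~ length_cm * island, data = overlap)
summary(fit)$coefficients
```

The interaction term tests whether the length-feeding slope differs between islands; the island main effect is then the offset at a given length, not a difference in raw means.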
Potential predictor variables are site-level fish, benthic, and rugosity values. These are likely correlated with one another, and I need to determine which ones I ultimately want to use (if I model behavioral responses with multivariate regressions). I can also move to structural equation modeling (SEM) if I want to keep multiple correlated predictors.
First, check distribution of predictor variables of interest: not very normally distributed…
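A quick sketch of how the distribution check could be run for several predictors at once, using Shapiro-Wilk on raw and log-transformed values. The data are simulated placeholders; `scar_bm` and `rugosity` are used only as example column names.

```r
set.seed(2)
# Hypothetical site-level predictors; values are placeholders.
sites <- data.frame(
  scar_bm  = rlnorm(30, meanlog = 3, sdlog = 0.8),
  rugosity = rnorm(30, mean = 1.5, sd = 0.2)
)

# Shapiro-Wilk p-value per predictor, raw and log-transformed.
normality <- sapply(sites, function(x) {
  c(raw    = shapiro.test(x)$p.value,
    logged = shapiro.test(log(x))$p.value)
})
round(normality, 3)
```

Small p-values suggest departure from normality. With few sites the test has little power, so histograms or Q-Q plots are probably still worth a look.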
Variable selection notes:
- excluding both carnivore variables, as they are highly correlated with scarid biomass and total biomass. Eventually I could make these more nuanced by distinguishing actual predators, but right now I don't think they reflect the actual populations of predators of >15 cm parrotfish
- rugosity is highly correlated with turf cover and scarid density
- scarid density: removing for now, because I think it was a bit skewed by Barbuda juveniles
- could eventually use consp. scarid length as another indicator of overfishing?
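The screening described in these notes could look roughly like the following: compute a rank correlation matrix and list the pairs above a cutoff. The data are simulated, `carn_bm` is a hypothetical column name, and the 0.7 threshold is an arbitrary assumption rather than the cutoff actually used.

```r
set.seed(3)
n <- 30
# Placeholder predictors; the built-in correlation is invented for the demo.
scar_bm <- rnorm(n)
preds <- data.frame(
  scar_bm = scar_bm,
  carn_bm = scar_bm + rnorm(n, sd = 0.1),   # deliberately collinear with scar_bm
  turf    = rnorm(n)
)

# Spearman is rank-based, which suits non-normal predictors.
cors <- cor(preds, method = "spearman")
high <- which(abs(cors) > 0.7 & upper.tri(cors), arr.ind = TRUE)
data.frame(
  var1 = rownames(cors)[high[, "row"]],
  var2 = colnames(cors)[high[, "col"]],
  rho  = cors[high]
)
```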
## Importance of components:
##                          PC1    PC2     PC3     PC4     PC5     PC6
## Standard deviation     1.694 1.5167 0.66913 0.50517 0.30585 0.18692
## Proportion of Variance 0.478 0.3834 0.07462 0.04253 0.01559 0.00582
## Cumulative Proportion  0.478 0.8614 0.93605 0.97859 0.99418 1.00000
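In the output above the first two components carry about 86% of the variance. A sketch of how a table like this can be produced with `prcomp()` and how the retained scores might be pulled out as `pc1`/`pc2` predictors; the six input variables are random placeholders and the 0.85 cutoff is an assumption.

```r
set.seed(4)
# Six made-up site-level variables standing in for the real predictors.
site_vars <- as.data.frame(matrix(rnorm(20 * 6), ncol = 6))
names(site_vars) <- paste0("var", 1:6)

# Scale to unit variance so variables in different units are comparable.
pca <- prcomp(site_vars, center = TRUE, scale. = TRUE)
imp <- summary(pca)$importance   # rows: SD, proportion, cumulative proportion

# First component at which cumulative variance reaches the cutoff.
n_keep <- which(imp["Cumulative Proportion", ] >= 0.85)[1]

# Site scores on the retained axes, e.g. to merge back as pc1, pc2, ...
scores <- as.data.frame(pca$x[, seq_len(n_keep), drop = FALSE])
names(scores) <- tolower(names(scores))
```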
Fish-level grazing behaviors (as well as competitive interaction frequency)
Variable selection notes:
- for_bites is correlated with fr and for_dur, but I will keep it for now.
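The outputs below repeat one pattern per response and dataset: one-way ANOVA by island, Tukey HSD contrasts, then Levene's test. A base-R sketch of that pattern on simulated data follows. The Levene step is written out by hand as an ANOVA on absolute deviations from group medians, which is my understanding of what `car::leveneTest()` computes with `center = median`.

```r
set.seed(5)
# Simulated data; island names are from the notes, fr values are invented.
dat <- data.frame(
  island = factor(rep(c("Bonaire", "Antigua", "Barbuda"), each = 25)),
  fr     = c(rnorm(25, 1500, 300), rnorm(25, 700, 300), rnorm(25, 600, 300))
)

fit <- aov(fr ~ island, data = dat)
summary(fit)
tukey <- TukeyHSD(fit)   # all pairwise island contrasts

# Levene's test (center = median): one-way ANOVA on absolute deviations
# from each group's median.
dat$abs_dev <- abs(dat$fr - ave(dat$fr, dat$island, FUN = median))
levene <- anova(lm(abs_dev ~ island, data = dat))
levene
```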
## Df Sum Sq Mean Sq F value Pr(>F)
## island 2 19329868 9664934 24.48 5.08e-09 ***
## Residuals 80 31586112 394826
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = fr ~ island, data = vet)
##
## $island
## diff lwr upr p adj
## Antigua-Bonaire -936.8111 -1387.7231 -485.8992 0.0000115
## Barbuda-Bonaire -1058.3667 -1486.4053 -630.3282 0.0000002
## Barbuda-Antigua -121.5556 -670.7081 427.5968 0.8575559
## Levene's Test for Homogeneity of Variance (center = median)
## Df F value Pr(>F)
## group 2 0.731 0.4846
## 80
## Df Sum Sq Mean Sq F value Pr(>F)
## island 2 3522230 1761115 26.9 5.52e-10 ***
## Residuals 95 6218718 65460
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = fr ~ island, data = vir, white.adjust = T)
##
## $island
## diff lwr upr p adj
## Antigua-Bonaire -408.89527 -550.1969 -267.59360 0.0000000
## Barbuda-Bonaire -70.05115 -234.5194 94.41711 0.5698132
## Barbuda-Antigua 338.84411 180.1422 497.54604 0.0000055
## Levene's Test for Homogeneity of Variance (center = median)
## Df F value Pr(>F)
## group 2 0.9125 0.405
## 95
## Df Sum Sq Mean Sq F value Pr(>F)
## island 2 1.582 0.7911 22.98 1.3e-08 ***
## Residuals 80 2.753 0.0344
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = g_frac ~ island, data = vet, white.adjust = T)
##
## $island
## diff lwr upr p adj
## Antigua-Bonaire -0.27575852 -0.4088858 -0.1426312 0.0000122
## Barbuda-Bonaire -0.29697818 -0.4233524 -0.1706040 0.0000008
## Barbuda-Antigua -0.02121967 -0.1833515 0.1409122 0.9476109
## Levene's Test for Homogeneity of Variance (center = median)
## Df F value Pr(>F)
## group 2 0.1837 0.8325
## 80
## Df Sum Sq Mean Sq F value Pr(>F)
## island 2 2.808 1.4040 26.13 9.1e-10 ***
## Residuals 95 5.105 0.0537
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = g_frac ~ island, data = vir, white.adjust = T)
##
## $island
## diff lwr upr p adj
## Antigua-Bonaire -0.36517814 -0.4932046 -0.23715172 0.0000000
## Barbuda-Bonaire -0.06284811 -0.2118646 0.08616841 0.5760464
## Barbuda-Antigua 0.30233003 0.1585381 0.44612197 0.0000076
## Levene's Test for Homogeneity of Variance (center = median)
## Df F value Pr(>F)
## group 2 0.7501 0.4751
## 95
## Df Sum Sq Mean Sq F value Pr(>F)
## island 2 0.314 0.1571 1.448 0.241
## Residuals 80 8.676 0.1085
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = br ~ island, data = vet)
##
## $island
## diff lwr upr p adj
## Antigua-Bonaire -0.123227748 -0.3595491 0.11309360 0.4303229
## Barbuda-Bonaire -0.132053164 -0.3563867 0.09228034 0.3427971
## Barbuda-Antigua -0.008825416 -0.2966343 0.27898343 0.9970480
## Levene's Test for Homogeneity of Variance (center = median)
## Df F value Pr(>F)
## group 2 10.408 9.601e-05 ***
## 80
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Df Sum Sq Mean Sq F value Pr(>F)
## island 2 0.045 0.02253 0.502 0.607
## Residuals 95 4.267 0.04491
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = br ~ island, data = vir, white.adjust = T)
##
## $island
## diff lwr upr p adj
## Antigua-Bonaire 0.045754540 -0.07128921 0.1627983 0.6222646
## Barbuda-Bonaire 0.043704156 -0.09252906 0.1799374 0.7260125
## Barbuda-Antigua -0.002050384 -0.13350720 0.1294064 0.9992399
## Levene's Test for Homogeneity of Variance (center = median)
## Df F value Pr(>F)
## group 2 10.63 6.823e-05 ***
## 95
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Df Sum Sq Mean Sq F value Pr(>F)
## island 2 1457 728.5 7.298 0.00126 **
## Residuals 76 7586 99.8
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 4 observations deleted due to missingness
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = for_bites ~ island, data = vet)
##
## $island
## diff lwr upr p adj
## Antigua-Bonaire -9.09956631 -16.491319 -1.707813 0.0118676
## Barbuda-Bonaire -9.17905349 -16.570807 -1.787300 0.0110379
## Barbuda-Antigua -0.07948718 -9.447089 9.288114 0.9997732
## Levene's Test for Homogeneity of Variance (center = median)
## Df F value Pr(>F)
## group 2 2.1906 0.1189
## 76
## Df Sum Sq Mean Sq F value Pr(>F)
## island 2 903.1 451.5 13.19 9.77e-06 ***
## Residuals 88 3012.5 34.2
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 7 observations deleted due to missingness
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = for_bites ~ island, data = vir, white.adjust = T)
##
## $island
## diff lwr upr p adj
## Antigua-Bonaire -7.291484 -10.6759972 -3.90697129 0.0000050
## Barbuda-Bonaire -3.692020 -7.4808559 0.09681671 0.0578051
## Barbuda-Antigua 3.599465 -0.1446468 7.34357613 0.0621841
## Levene's Test for Homogeneity of Variance (center = median)
## Df F value Pr(>F)
## group 2 5.0607 0.00831 **
## 88
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
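Levene's test above is significant for `br` in both datasets and for `for_bites` in `vir`, so the equal-variance assumption looks doubtful there. To my knowledge `white.adjust` is an argument of `car::Anova()` rather than `aov()`, so the `white.adjust = T` in the calls above may not be applying any correction. One alternative, sketched on simulated data, is Welch's one-way test with unpooled pairwise comparisons.

```r
set.seed(6)
# Simulated br values with equal means but unequal spread across islands.
dat <- data.frame(
  island = factor(rep(c("Bonaire", "Antigua", "Barbuda"), each = 30)),
  br     = c(rnorm(30, 0.4, 0.40), rnorm(30, 0.4, 0.10), rnorm(30, 0.4, 0.05))
)

# Welch's one-way test drops the equal-variance assumption.
welch <- oneway.test(br ~ island, data = dat, var.equal = FALSE)
welch

# Pairwise follow-up without pooling the standard deviation.
pw <- pairwise.t.test(dat$br, dat$island, pool.sd = FALSE,
                      p.adjust.method = "holm")
pw$p.value
```

This swaps Tukey HSD for Holm-adjusted Welch t-tests, so the adjusted p-values are not directly comparable with the Tukey tables above.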
Note: remove G1 grazing instances here?
Notes:
- scarid biomass is not the best predictor once I account for differences among samples in the sizes of fish I sampled. I think the grazing/length relationships are much stronger.
- restricting the sample to length windows lowers sample size and makes trends much less pronounced, esp. for phase differences
- reducing the sample to an individual phase only also blurs trends
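One way to check the first note more formally, sketched with `nlme` on simulated data: fit the mixed model with and without `scar_bm` and compare them. The simulated `fr` depends only on length and island, so `scar_bm` is noise by construction. Models are fit with ML instead of REML because the fixed effects differ between them.

```r
library(nlme)

set.seed(7)
n <- 150
# Simulated data; scar_bm is pure noise here, and all effect sizes are invented.
dat <- data.frame(
  island    = factor(sample(c("Antigua", "Barbuda", "Bonaire"), n, replace = TRUE)),
  length_cm = runif(n, 10, 40),
  scar_bm   = rnorm(n)
)
island_shift <- c(Antigua = -300, Barbuda = -400, Bonaire = 400)
dat$fr <- 1300 - 10 * dat$length_cm +
  unname(island_shift[as.character(dat$island)]) + rnorm(n, sd = 200)

# ML fits so that models with different fixed effects are comparable.
m_len  <- lme(fr ~ length_cm, random = ~ 1 | island,
              data = dat, method = "ML")
m_both <- lme(fr ~ length_cm + scar_bm, random = ~ 1 | island,
              data = dat, method = "ML")
cmp <- anova(m_len, m_both)   # AIC, BIC, logLik and a likelihood-ratio test
cmp
```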
## Linear mixed-effects model fit by REML
## Data: filter(sum_id_pca1, species_code != "rbp")
## AIC BIC logLik
## 9322.943 9358.316 -4653.471
##
## Random effects:
## Formula: ~1 | island
## (Intercept) Residual
## StdDev: 442.1831 450.4068
##
## Fixed effects: fr ~ phase + length_cm + species + pc1 + pc2
## Value Std.Error DF t-value p-value
## (Intercept) 1326.6141 270.99370 613 4.895369 0.0000
## phaset -163.0290 50.24074 613 -3.244957 0.0012
## length_cm -9.1235 3.52618 613 -2.587367 0.0099
## speciesSparisoma viride -623.1907 37.04353 613 -16.823201 0.0000
## pc1 89.0698 44.32152 613 2.009628 0.0449
## pc2 8.7783 31.20623 613 0.281299 0.7786
## Correlation:
## (Intr) phaset lngth_ spcsSv pc1
## phaset 0.168
## length_cm -0.307 -0.651
## speciesSparisoma viride -0.105 -0.067 0.085
## pc1 0.037 0.050 0.021 0.008
## pc2 0.046 0.007 -0.012 0.003 0.114
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -2.89990834 -0.56004163 -0.01254647 0.53166747 5.13083127
##
## Number of Observations: 621
## Number of Groups: 3
## phase length_cm species pc1 pc2
## 1.749662 1.751095 1.007674 1.021045 1.013411
##
## Family: gaussian
## Link function: identity
##
## Formula:
## fr ~ species + phase + s(length_cm) + s(scar_bm) + s(pc1) + s(pc2)
##
## Parametric coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1088.72 27.46 39.652 < 2e-16 ***
## speciesSparisoma aurofrenatum -508.69 50.10 -10.153 < 2e-16 ***
## speciesSparisoma viride -618.17 33.37 -18.526 < 2e-16 ***
## phaset -111.30 39.36 -2.828 0.00481 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(length_cm) 1.000 1.000 16.207 6.22e-05 ***
## s(scar_bm) 1.006 1.009 2.544 0.11121
## s(pc1) 3.481 4.039 33.161 < 2e-16 ***
## s(pc2) 3.814 4.375 3.524 0.00476 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.513 Deviance explained = 52.1%
## -REML = 5772.3 Scale est. = 1.627e+05 n = 782
To Do as of Nov. 7
* boosted regression trees (Adrian's Ecosphere 2017 paper)
* species as random effect
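For the species-as-random-effect item, one possible route is `mgcv`, where `s(x, bs = "re")` treats a factor as a random intercept and several grouping factors can be combined. This is a sketch on simulated data, not the model used above; the species labels and effect sizes are invented.

```r
library(mgcv)

set.seed(8)
n <- 300
# Simulated data; species labels and effect sizes are placeholders.
dat <- data.frame(
  species   = factor(sample(c("sp_a", "sp_b", "sp_c"), n, replace = TRUE)),
  island    = factor(sample(c("Antigua", "Barbuda", "Bonaire"), n, replace = TRUE)),
  length_cm = runif(n, 10, 40)
)
sp_shift <- c(sp_a = 0, sp_b = -500, sp_c = -600)
dat$fr <- 1100 - 10 * dat$length_cm +
  unname(sp_shift[as.character(dat$species)]) + rnorm(n, sd = 300)

# bs = "re" needs factor inputs; each term is a random intercept.
fit <- gam(fr ~ s(length_cm) + s(species, bs = "re") + s(island, bs = "re"),
           data = dat, method = "REML")
summary(fit)$s.table
```

With only three species and three islands, the random-effect variances are estimated from very few levels, so treating species as a fixed effect may remain the more defensible choice.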